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We use computer simulations to investigate self-assembly in a system of model chaperonin proteins, and in an 
Ising lattice gas. We discuss the mechanisms responsible for rapid and efficient assembly in these systems, and 
we use measurements of dynamical activity and assembly progress to compare their propensities for kinetic 
trapping. We use the analytic solution of a simple minimal model to illustrate the key features associated 
with such trapping, paying particular attention to the number of ways that particles can misbind. We discuss 
the relevance of our results for the design and control of self-assembly in general. 



I. INTRODUCTION 

In self-assembl}'^^, particles combine spontaneously to 
form structures that can be closed, like capsids^ and 
DNA 'origami SI, or extended, like filaments^, sheets^ ^ 
and unusual crystals^^"^. The possibility of exploiting 
assembly for technological ends has been discussed many 
times^, but to realize this possibility we need to develop 
the ability to predict and control the properties of exper- 
imental self-assembling systems in general. In particular, 
understanding how systems can be designed so as to as- 
semble reliably and rapidly while avoiding kinetic traps 
remains a key challenge. 

Effective dynamical assembly typically requires bond- 
making and bond-breaking events, so that assembling 
particles can avoid long-lived disordered structures and 
form the desired ordered one. The role of transient un- 
binding during self-assembly is understood at a qualita- 
tive levepEMiE particles on the micro- and nanoscale 
can exploit thermal fluctuations in order to sample a 
range of bound configurations as structures grow. Such 
fluctuations allow particles to break local bonds and es- 
cape the kinetic 'traps' that result when misbound parti- 
cles become frozen into place by the arrival of more ma- 
terial. The importance of such fluctuations is apparent 
from measurements, in computer simulations, of assem- 
bly yield as a function of particle binding strength. Typ- 
ically, such curves are non-monotonic, with a decrease in 
yield at large bindin g strength d ue to the suppression of 
bond-breaking event ^^f^^^^S^nilll (^ggg Yig. [l]). However, 
while the roles of fluctuations and transient unbinding 
are clear at this qualitative level, it is not clear 'how 
much' reversibility is required for effective self-assembly 
in a given system. 

Here we address this question. We introduce a toy 
model of assembly whose analytic solution demonstrates 
a minimal set of requirements for kinetic trapping. We 
also consider computer simulations of two models of in- 
teracting particles. The first is an off-lattice, coarse- 
grained model^^ of 'chaperonin' proteins from which 
filament-like and sheet-like structures can assemble. The 
second is the two-dimensional lattice gas, whose separa- 
tion into dense and dilute phases ex hibits many of the 
characteristic features of self-assembl}^^. We discuss 
the assembly mechanisms in these models, and in par- 



ticular identify whether assembly is more efficient when 
a single structure forms by nucleation and growth, or 
when multiple structures form simultaneously. We then 
consider the role of thermal fluctuations, comparing mea- 
surements of dynamical activit}/^^ with the flux to- 
wards the assembled state. For example, as chaperonin 
particles assemble into a close-packed sheet, they typi- 
cally bind and unbind hundreds or thousands of times 
before attaining their final positions. We find that both 
the mechanism of assembly and the dynamical activity 
indicate the effectiveness of a system in avoiding (or es- 
caping from) kinetic traps, and we discuss the relevance 
of these results for the design of self-assembling systems. 



II. MODELS AND ASSEMBLY YIELDS 
A. General considerations 

A key aim of this article is to identify features that are 
conserved between different self-assembling systems. To 
this end, we show results for three model systems, em- 
phasising their common features as well as some salient 
differences. In all cases we initialise interacting particles 
in disordered configurations and they evolve with diffu- 
sive dynamics towards low-energy thermally-equilibrated 
structures. For example, we will consider model chaper- 
onin proteins that assemble into extended close-packed 
sheets (full details are given in Sec. II B). We define the 
'yield' of this assembly process to be the fraction of par- 
ticles embedded in such close-packed sheets. To facilitate 
comparison between systems, we consistently use eb/T to 
denote a dimensionless measure of the strength of inter- 
particle bonds; we also use nopt to denote the assembly 
yield, defined as the fraction of particles that are in 'op- 
timal' bonding environments. 

Fig. [TJa) shows results for the sheet-forming chap- 
eronin system and Fig. [ijb) shows results for a two- 
dimensional lattice model where particles assemble into 
large close-packed clusters (see Sec. II C for full details). 
For these two systems, on these time scales, nopt is large 
only in a narrow range of bond strength. When bonds 
are too weak, the assembled structure is not stable; when 
bonds are too strong, the system is vulnerable to kinetic 
trapping. We contrast this behavior with that of a dif- 
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FIG. 1. Assembly yield (riopt) versus binding strength Ch/T^ 
for various times and for equilibrated systems. We show 
representative snapshots of clusters at long times, for bond 
strengths indicated, (a) Sheet-forming chaperonin system 
(cr = 0.3), (b) lattice gas, (c) filament-forming chaperonin 
system. In cases (a) and (b), dynamic yields at fixed time 
are non-monotonic in binding strength Cb/T; in (c), yield is 
monotonic, reflecting the absence of kinetic trapping. Data 
marked 'long' are taken from simulations lasting 300 hours of 
CPU time, rather than a fixed final time t. 



ferent model of chaperonin proteins which assemble into 
long filaments. Fig.[TJc) shows that this process does not 
suffer kinetic trapping even when bonds are very strong: 
the yield is monotonic in eb/T. 



B. Chaperonin model 

Chaperonin protein^^ assemble in vitro into a range 
of structures that include extended two-dimensional 
shee ts an d quasi one-dimensional filaments. Following 
j^g£g [221251^ we model chaperonins as hard spheres of di- 
ameter 2a equipped with orientation-dependent pairwise 
interactions that encourage either equator-to-equator or 
pole-to-pole binding (see Appendix [B|). The anisotropic 
interactions have range a/4 and are characterized by a 
dimensionless bond strength eb/T. They also depend on 
a parameter a that determines how precisely two chap- 
eronins must align before they receive an energetic re- 
ward: the smaller is a, the more specific is the angu- 
lar interaction. We simulated TV = 1000 chaperonins 
in periodically-replicated cubic boxes of side L. Chaper- 
onins were present at a concentration of 0.82% by volume 
(i.e. |7V7r(a/L)3 = 0.0082). 

In sheet-forming systems, particle interactions pro- 
mote equator-to-equator binding. We focus on a system 
with angular specificity parameter a = 0.3 (a fairly strict 
alignment criterion). We also contrast the behaviour of 
this system with that of a system possessing angular 
specificity parameter cr = 0.7 (a more generous alignment 
criterion). At large bond strengths, equilibrium config- 
urations of these systems contain a large close-packed 
planar sheet; for weak bonds the equilibrium is a dilute 
gas of free particles or small clusters. At the low concen- 
trations studied, these systems do not form liquid phases 
or three-dimensional crystals. 

We also considered a filament-forming system whose 
interactions favour pole-to-pole binding. Its equilibrium 
state for large binding strength is a collection of long 
filaments. 

For concreteness, we have selected particular values 
for parameters such as the specificity a and the volume 
fraction. Although there is a degree of arbitrariness in 
the particular values chosen, we find that the qualitative 
behaviour of the systems we consider here varies only 
weakly if we vary model parameters over a wide range 
of values. For instance, we do not find regimes in which 
the yields of chaperonin sheet formers or the lattice gas 
(Fig. [T]) vary monotonically with binding strength. In- 
deed, Fig. [l] shows that these systems exhibit similar 
qualitative trends, despite their differences in dimension, 
packing fraction, and the microscopic detail of their inter- 
actions. Similar behaviour has been observed in a range 
of other self- assembling systemJ^^^^HI^. We are therefore 
confident that our results are relevant for a range of self- 
assembling model systems; we would also expect similar 
phenomenology to be reproduced in experiments. 

We performed dynamic simulations, starting from 
well- mixed configurations, using the virtual- move Monte 
Carlo (MC) algorithn:P described in Ref.^. This algo- 
rithm approximates a diffusive dynamics by using po- 
tential energy gradients to generate both single-particle- 
and collective translations and rotations. We define tb 
as the mean time taken for an isolated particle to diffuse 



3 




(&) 



El = -eb/2 

States I .-'misbound 
particles' 



-/3eb 



State 2: 'optimally 
bound particles' ^ 

E2 = — eb 



M = 10 



M = 




5 / 10 



5 /rr. 10 

eh/T 



FIG. 2. Analytic toy model of assembly demonstrating the requirements for kinetic trapping, (a) Particles transfer between 
the 'monomer', 'misbound' and 'optimally bound' levels with the rates shown; Ch is the particle binding strength; c is a 
concentration variable (set to 10~^ in the other panels); and M is the number of ways of misbinding. (b) When there exists 
the possibility of misbinding (M > 0), the dynamic yield is non-monotonic with Cb, because as Ch increases 1) equilibrium yield 
increases but 2) the escape rate from misbound states decreases, (c) When misbinding is not possible (M = 0), dynamic yield 
increases with binding strength. Similar behaviour is seen in computer models in Fig. [l] 



a length equal to its diameter (150 MC steps in our sim- 
ulations). For later convenience we define to = 10^ MC 
steps ^ GTOtB' For the sheet-forming systems we sam- 
pled thermal equilibrium by starting from a large close- 
packed sheet inserted into a gas of monomers, and using 
local Monte Carlo moves supplemented by the nonlocal 
algorithm described in Ref.^^. 

To define the yield nopt, we consider two particles i 
and j to be neighbours if their interaction energy Eij < 
—2T. For the sheet-forming model the optimal number of 
neighbours is A/'max = 6; for the filament-forming model 
A/'max = 2. The yield nopt is the fraction of particles with 
this number of neighbours. We also define a normalised 
energy ('fraction of possible bonds') 



2E 



(1) 



where E is the total energy of the system. Thus, nb = 
if no bonds are present, while nb = 1 if all particles are 
in optimal binding environments. 

The results shown in Fig [T] illustrate that the sheet- 
forming model suffers from kinetic trapping when eb/T 
is large, so that good assembly occurs only in an inter- 
mediate range of bond strengths^^. On the other hand, 
growing filaments in this model cannot become kineti- 
cally trapped: each particle can bind only at its north 
or south pole, and each of those two modes of binding 
permits the structure to be extended in an orderly man- 
ner. In this case, thermal fluctuations do not facilitate 
assembly, but instead break up long filaments and reduce 
yield. We note that assembly of filaments may still suffer 
from kinetic trapping if they have more internal structure 
than the simple strings of particles considered here^ 



C. Lattice gas 

We also consider the two-dimensional lattice gas, com- 
prising N particles on a square lattice of V = sites. 
Particles on nearest-neighbouring sites form bonds of 
energy — eb; particles may not overlap. The system 
phase-separates when bonds are strong, forming dense 
(liquid) and dilute (gas) phases. We work at density 
p = N/V = 0.002 for which the onset of phase separation 
(binodal) is at eb/T = 3.i^. We take L = 2048 through- 
out. Motivated by the characteristic non-monotonic yield 
shown in Fig. [ijb), we draw an analogy between this 
phase separation and the self-assembly observed in the 
chaperonin model^^ In the limit of large eb/T we ob- 
serve diffusion-limited cluster aggregation^^, an example 
of kinetic trapping that frustrates phase separation. 

We again used an MC scheme with cluster moves in 
order to simulate the dynamics of Brownian particles dis- 
persed in a solvent. Our scheme is a variant of the 'cleav- 
ing' algorithm of Ref.^^: In each MC move we select a 
seed particle, and begin to grow a cluster by adding to 
the seed, with probability Pc = I — e~^^^^^ ^ each of its 
neighbouring particles. Here A = 0.9 is a parameter that 
controls the relative likelihood of moving single particles 
as opposed to whole clusters. This process of adding par- 
ticles to the cluster is repeated recursively until no more 
particles are added. We then attempt to move the result- 
ing cluster in a random direction. We reject any moves 
that would lead to more than one particle on any site; 
otherwise we calculate the energy difference AE between 
the original and proposed configurations. The cluster is 
moved with probability Pm = Pa/n'^ where n is the size 
of the cluster and p^ = min(l, e-^^"^)^^/^) if AE ^ 0. 
When AE = we take pg, = a with a = 0.9. The factors 
Pa 7 Pm and Pc together ensure that the dynamics obey 
detailed balance and that clusters of n particles diffuse 
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with a rate proportional to 1/n. The parameters a and 
A are chosen for computational efficiency, and for con- 
sistency with our other studies of this modeP^. An MC 
sweep comprises MC moves. The Brownian time for 
an isolated particle is tb ^ 1.11 MC sweep. Equilib- 
rium conditions were probed using simulations that were 
initialised with a large assembled cluster. 

Particles are considered to be neighbours if they are 
on adjacent lattice sites. The optimal number of neigh- 
bours i s A^m ax = 4, allowing nopt and nb to be defined as 
in Sec. II B Fig. [TJb) shows that the assembly yield of 
this simple two-dimensional lattice model is qualitatively 
similar to that of the sheet-forming chaperonin model. In 
what follows, we use comparisons between these systems 
to identify which assembly properties may be generalised 
between models, and which are model-dependent. 



D. Schematic model of assembly and kinetic trapping 

To illustrate the physical origins of the behaviour in 
Fig. [l] we introduce a toy model of self-assembly. We 
consider a large number of particles, each of which can 
inhabit any of three energy levels: a 'monomer' level of 
energy 0, a 'misbound' level of energy — 6b/2, and an 
'optimally bound' level of energy — see Fig. |2|a). 
Particles begin in the monomer level, and transfer into 
the bound levels with the displayed rates. Here c is a 
concentration-like variable, and M is the degeneracy of 
the misbound level, which refiects the number of ways a 
particle can misbind. Particles escape from bound states 
with the Arrhenius-like rates shown. 

Denoting the unbound, misbound and optimally bound 
states by 0, 1, 2 respectively, the model is described by a 
master equation 



dt 



P{t) = WP{t), 



(2) 



where P{t) = (Pq A W); the variable Pi{t) is 
the probability that a particle resides in state i at time 
t: and the matrix W is 



-c(M + l) a 



W 



cM 
c 







(3) 



We have defined a = e~^b/2T compactness of notation 
and we take Boltzmann's constant /cb = 1 throughout 
this paper. The yield in this model is nopt = ^2- 

All particles start in the monomer state, so that 
Eq. ([2| is to be solved with the initial condition P{0) — 
(1,0,0). The solution is obtained by matrix diagonal- 
isation; details are given in Appendix [Aj In the long- 
time limit. Pit) converges to the equilibrium distribution 
s = ^ (a^, cMa, c) where Z = c-\- cMa + a'^ is the parti- 
tion function. Thus the equilibrium (long-time) yield is 

^eq = c/(c + cMa + a^). 



Dynamic yields are shown in Fig.[2|b,c). The long-time 
yield neq increases as particle binding strength Cb/T in- 
creases. However, the escape rate a from the misbound 
state decreases as eb/T increases, so that misbound parti- 
cles take a long time to unbind and transfer to the bound 
state. As long as M > 0, these two confiicting effects 
result in a yield nopt that at finite times decreases for 
large binding strength (Fig.[2|b)). We show in Appendix 
[Ajthat if a is small then reaching the equilibrium yield 
takes a time of order (M + l)/a. However, if M = 0, 
i.e. there is no possibility of binding in a non-productive 
manner, then yield increases monotonically with binding 
strength (Fig.j^Jc)). 

When M > the toy model reproduces the quali- 
tative dependence of yield on time and bond strength 
shown in Figs.^a,b). On the other hand, the behaviour 
shown in Fig. mc) is reproduced by the toy model when 
M = 0. We see immediately the three requirements for a 
dynamic yield that is non-monotonic in particle binding 
strength: 1) equilibrium yield increases with increasing 
binding strength; 2) there exists the possibility of mis- 
binding; and 3) the escape rate from misbound states 
decreases with increasing binding strength. 



III. ASSEMBLY MECHANISMS 

We collate information about assembly mechanisms at 
different state points and in different models by plot- 
ting in Fig. |3] the normalized energy nb against the 
normalized number of optimally-bound particles nopt 
(which amounts to using energy as a measure of assem- 
bly progress). If the system contains large clusters of 
optimally bound particles then one expects nopt ^ ^b- 
However, particles on the cluster surfaces contribute to 
nb but not to nopt so one finds in general that nopt < ^b- 
For fixed nb the difference nb — nopt is smallest when the 
system contains one large cluster, which has relatively 
few surface particles. For kinetically trapped states one 
typically finds nopt <^ ^b, because few particles are in 
optimal environments. 

In the chaperonin system there is a pronounced nucle- 
ation regime in which assembly proceeds by growth of 
a single large cluster. Since nucleation is a rare event, 
this regime is characterised by system-wide fiuctuations. 
However, the assembly mechanism does not fiuctuate, 
but is the same for all trajectories: a single sheet grows 
from a gas of particles (evidence for this assertion is given 
in Appendix [C|). In the parametric plots of Fig.jsj this be- 
comes clearest when we plot (nopt)nb7 the assembly yield 
from multiple trajectories averaged over configurations 
with a given value of nb- For the lattice gas system, the 
free energy barrier to nucleation is smaller, fluctuations 
between trajectories are less pronounced, and it is ap- 
propriate to take time as a parametric variable. We plot 
quantities averaged at constant time, (nopt(^)) against 
{^h{t))- We also show isochrones, lines connecting points 
of equal time (lattice gas), or points of equal average time 
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FIG. 3. Assembly quality riopt versus progress rib for (a) the chaperonin sheet-forming system (cr = 0.3), (b) the lattice gas 
(isochrones are at 10^, 10^ and 10^ MC steps), and (c) the chaperonin filament-forming system. We show data for a range of 
bond strengths e^/T^ as indicated. Time advances from bottom left to top right: dotted lines of constant time (isochrones) 
are drawn. In (a) the straight lines for the two highest temperatures indicate that assembly corresponds to the nucleation and 
growth of a single sheet; as temperature is lowered, multiple nucleation events are seen, and curves bend away from this line. 
Peak yield at long time is obtained (at Ch/T ^ 6.8) slightly away from the single-sheet nucleation regime (peak yield is obtained 
even further from this regime for a sheet-forming system with a more generous angular binding criterion: see Fig.lIJc)). Similar 
behaviour is seen in (b), although the nucleation regime is less pronounced. In (c), lowering temperature changes the assembly 
mechanism only slightly, and maximal yield at fixed time is always obtained for the lowest temperature considered. 



(chaperonin systems). 

Fig. [3] allows us to draw several conclusions about the 
assembly mechanism in these systems. In Fig. |3ja) the 
nearly-straight lines at the two highest temperatures in- 
dicate that assembly corresponds to the nucleation and 
growth of a single sheet. As temperature is lowered, mul- 
tiple nucleation events are seen, and curves bend away 
from this line. (Since there are multiple growing sheets, 
the fraction of bound particles located on cluster surfaces 
is larger, and nopt/^b is lower, than in the single-sheet 
regime.) The maximal yield riopt at long times is ob- 
tained at eb/T ^ 6.8, slightly away from the single-sheet 
nucleation regime. That is, while the ratio of surface 
to bulk particles is optimal in the single-sheet regime, 
the total number of assembled particles increases with 
Cb such that the yield continues to increase even as the 
surface-to-bulk ratio starts to fall. This competition be- 
tween quality and quantity of assembled product was re- 
cently discussed in Ref.^^. Further from the single-sheet 
nucleation regime the surface-to-bulk effect dominates, 
and yield begins to decline. For very strong bonds, clus- 
ters become ramified, as illustrated in Fig. [TJ and yield is 
small. We note in passing that optimal assembly regime 
seems to take place near the spinodal line for phase sep- 
aration, since it is associated with a nucleation barrier 
that is just small enough for nucleation to cease to be a 
rare event - the possibility of controlling the nucleation 
barrier to achieve optimal assembly was discussed irP^. 

In Fig. [sj^b), we show data for the lattice gas model, 
which behaves similarly to the sheet-forming chaper- 
onins: maximal yield is obtained in a regime in which 
many clusters grow simultaneously, but too strong an 
interaction again impairs assembly. By contrast, the 
assembly mechanism in the filament-forming system is 



largely insensitive to bond strength: the main effect of 
increasing eb/T is that the system makes more progress 
along the reaction coordinate (Fig. |3jc)). This again re- 
flects the low propensity for kinetic trapping in this sys- 
tem. 

Finally, we note that despite their different spatial di- 
mensionality and binding geometry, the sheet-forming 
chaperonin model and the lattice gas show similar be- 
haviour in the representations of Figs. [T] and |3] In both 
cases, assembly can take place through the nucleation 
and growth of a single structure, but optimal yield oc- 
curs in the regime in which several clusters (sheets) grow 
simultaneously (see also^^). In Fig. [i] we show data for a 
chaperonin sheet-forming system with an angular binding 
specificity (a = 0.7) more generous than that (<j = 0.3) 
studied in Figs, [l] and [s] In particular, Fig,|4]^b) indicates 
that two intermediate-sized sheets may coalesce and heal 
into a single larger close-packed sheet. This healing in- 
dicates that particles can escape kinetic traps. In Sec. |V| 
we discuss this effect in the context of assembly 'forgiv- 
ingness', the ability to recover an ordered product from 
a disordered intermediate state. 



IV. REVERSIBILITY OF BINDING 

A. Everything put together (well) falls apart (transiently): 
Statistics of bond-breaking and bond-making 

As we have discussed (see e.g. Fig. |2|, non-monotonic 
yields such as those shown in Fig. [l] occur because assem- 
bling particles must break bonds that are not compatible 
with the final ordered structur^^. In Fig. [s] we show the 
scaled energy Si = Ei/e\^ of each of 5 randomly-chosen 
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FIG. 4. Data for a chaperonin sheet-forming system with angular binding specificity a = 0.7 (less specific than the system in 
Figs [l] and [3]). (a) Yield; (b) fixed-time snapshots; (c) assembly mechanism. While this system's behaviour is broadly similar to 
that shown in Figs.[TJa) andjsj^a), some details of its assembly are different. Notably, sheets can fuse and heal (b), eliminating 
joins formed by collisions and allowing particles to acquire optimal binding environments. As a result, peak yield (c) is found 
further from the regime of single-sheet nucleation than in Fig.jsja): this system can tolerate deeper supercooling than the one 
with a — 0.3. 



chaperonin sheet-formers as a function of the time for 
two different bond strengths. We show similar data for 
the lattice gas system. It is clear that assembling par- 
ticles bind and unbind, and unbind more readily at the 
weaker bond strength. However, despite the clear link be- 
tween bond-breaking events and good assembly, we pos- 
sess little understanding of how many bond breakings are 
required in order to maintain effective assembly. 

To investigate this, we recorded for each particle the 
number of bound neighbours Moid it possessed before each 
accepted MC move, and the number of bound neigh- 
bours A^new it possessed after each accepted MC move. 
If A^new > AToid then we count a binding event for this 
particle; if Mnew < Moid then we count an unbinding 
evenl^. If A/'new = Moid then we assume that nothing 
happened to this particle (it might have gained and lost 
neighbours in equal number, but this happens so rarely 
in our simulations that we ignore it). We write K± to 
represent the total number of binding/unbinding events 
in a given time window of an assembly trajectory. We use 
these counts of binding and unbinding events to measure 
reversibility by separating them into time-reversal sym- 
metric and asymmetric measures. That is, averaging the 
numbers of events between times and t, we define the 
traffic (or dynamical activit}'^^ 



as 



and the flux as 



m 



{K+) + (K.) 
N 



N 



(4) 



(5) 



Traffic measures the total number of events per particle; 
flux measures the excess of binding over unbinding events 
per particle, and is a measure of the extent to which 
time-reversal symmetry is broken in the system. For an 
equilibrated system (which is time-reversal symmetric), 
we have T{t) = and T{t) ex t. For a system in which 
bonds never break, we have J^{t) = T{t). 

We show typical results in Fig. [6j The maximal possi- 
ble flux in a system is approximately ATmax- flux increases 
in time in a similar way to (nb(t)), because it quantifles 
the number of bonds in the system. The flux therefore 
saturates at long times, while the traffic continues to in- 
crease (events continue to happen in the system even 
after it has equilibrated in the assembled state). 



B. Two steps forwards, one step back: quantifying 
reversibility 

Close inspection of Fig. |6] reveals that under optimal 
assembly conditions in the sheet-forming system, par- 
ticles eventually form on average about 5.5 bonds, but 
participate in about 4000 binding events (and so around 
3994.5 unbinding events). Lattice gas particles typically 
participate in approximately 2500 binding and 2497 un- 
binding events in order to achieve a net gain of 3 bonds. 
In the filament-forming system, at the lowest temper- 
ature probed, particles participate in fewer than two 
events per bond formed. No reversibility is required in 
this case, because no misbinding can happen. 

To interpret these results, it is useful to return to the 



7 



Sheet- forming chaperonins 

(a) eb/T = 6.8 (b) eb/T : 




5.0x10^ 1.0x10^ 



FIG. 5. (a,b) Scaled energy 8i for each of 5 randomly- chosen 
sheet-forming chaperonin particles as a function of time t, 
for two different bond strengths and a = 0.3. (c,d) Similar 
data for lattice gas particles. Assembling particles bind and 
unbind, with unbinding being more frequent when bonds are 
weaker. The range of times shown is such that substantial 
assembly has occurred by the end of all trajectories. 



toy model defined in Sec. IIP We assume that reaching 
the 'optimally bound' state results in 2 binding events, 
and reaching the 'misbound' state results in 1 event 
(the idea is that optimally-bound particles typically have 
A^max neighbours while misbound ones have fewer than 
ATmax; we take AT^ax 2). 

Assuming that eb/T is large, it is useful to work at 
leading order ina = e"^^/^^^). The analysis is performed 
in Appendix [Aj here we summarise the main results. In 
the limit of small a (and assuming M > 0), the toy model 
approaches the assembled state as nopt ~ 1 — e~'^^^ with 
r ^ (M + l)/a. There is a broad time window r <^t <^ 
in which T{t) ^ 2 and T{t) ^ 2(M + 1). Making a 
parametric plot of flux against traffic as in Fig.[7|^a), one 
observes that for large eb, the traffic plays the role of a 
clock, with the system reaching the assembled state when 
r{t) ^ 2{M -h 1) (and r{t)/J='{t) = M + 1). [If the limit 
of large Cb has not yet been reached, the system reaches 
equilibrium at a value of T{t) larger than 2(M + 1).] 

We show parametric plots of flux and traffic for the 
sheet-forming chaperonin model and the lattice gas in 
Figs.[7]^b,c). At a fixed value of traffic, flux is an increas- 
ing function of eb, reflecting the role of eb as a driving 
force towards the assembled state. But at fixed time, 
traffic is a decreasing function of eb, reflecting the role of 
eb in the activation energy for escaping from misbound 
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(a)^(t) (b)r(t) 




10"^ 10"^ 



Lattice gas 

(c)J-(t) 
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FIG. 6. (a,b) Flux and traffic measurements for the sheet- 
forming chaperonin system {a = 0.3); (c,d) Similar data for 
the lattice gas. At fixed time, fiux is non-monotonic in eb/T 
(compare yield in Fig.[T]); but traffic decreases with increasing 
Ch/T, due to the role of the bond strength as an activation 
energy for bond-breaking. 



states. 

The parametric plots of Fig. [7] allow comparison of 
reversibility of assembly between different systems. By 
comparison with the schematic model, we use this data 
to quantify systems' propensity for kinetic trapping, as 
follows. We calculate the ratio M{t) = T{t)/J^{t)\ un- 
der optimal assembly conditions in the schematic model 
(large eb/T and long time t) then M(t) approaches M+1, 
the ratio of the number of misbound and optimally bound 
states. Given simulations of fixed length t but varying 
eb/T, we define a parameter Meff(t) by evaluating M(t) 
in the system with optimal eb/T. In the schematic model, 
Meff (t) ^ M + 1 as long as substantial assembly occurs 
before time t for at least one value of eb/T. 

For our computer models, we obtain order-of- 
magnitude estimates of Meff as follows. For the sheet- 
forming chaperonins and times in the range 20 — lOOto, 
optimal assembly is in the range 7 < eb/T < 7.25. 
The flux is ~ 5 while the traffic is in the range 
2000 <T < 7000. We infer that Meff lies in the range 
400 — 1500. For the lattice gas model and times in the 
range 10^ — 10^ MC sweeps, optimal assembly is in the 
range 5 < eb/T < 5.7, the fiux is ~ 3 and the traffic 
in the range 500 - 5000; the range for Meff is 200 - 2000. 
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(a) Schematic model 




(b) Sheet- forming chaperonins 




10"^ 10"^ 10° 10^ 10^ 10^ lO"^ 10^ 10^ 

m 

FIG. 7. Parametric plots of flux and traffic during assembly, 
showing the role of traffic as a system clock, (a) Schematic 
M-state model with M = 10 and c = 0.01. For large Cb/T, as- 
sembly is complete after approximately 2(M + 1) = 22 events, 
(b) Sheet-forming chaperonin model {a = 0.3) with isochrones 
at the indicated times, (c) Lattice gas model (isochrones are 
at 10^, 10^ and 10^ MC steps). 



Given the large overlap in estimates of Meff for lattice gas 
and chaperonin systems, we conclude that the propensity 
for kinetic trapping in these two models are quite simi- 
lar. For the sheet-forming chaperonins with a = 0.7 (see 
Fig. [4| and taking time 20 — lOOto we obtain a range 
for Meff of 800 — 4000, systematically larger than the 
value for the sheet-forming chaperonins with a = 0.3. It 
may be that the less specific interaction potential offers 



more possibilities for disordered states, so that the sys- 
tem requires more unbonding events in order to reach 
a final ordered structure. For filament-forming chaper- 
onins, optimal assembly occurs at very large Cb/T, for 
which T ^ T and hence Meff = 1 (the analogous toy 
model has M = Meff — 1 = 0, as expected since there is 
no possibility for misbinding). 

We have emphasised the large uncertainties in the pa- 
rameter Meffi the model of Sec. |IID| is a toy model of 
assembly, and one should not expect a direct mapping to 
more detailed computer models. For example, the values 
we obtain for Meff depend on the method used to identify 
neighbouring particles in the chaperonin model, and on 
the time at which flux and traffic are measured. Physi- 
cally, the structures of the misbound states that cause ki- 
netic trapping vary with time as assembly takes place, so 
describing these states with a single number Meff is sim- 
plistic. Nevertheless, we argue that the parameter Meff 
which we extract provides a useful estimate of the im- 
portance of kinetic trapping in these assembling systems. 
Comparison of the values of Meff emphasises the differ- 
ence between sheet-forming and filament-forming chap- 
eronins. On the other hand, the difference between the 
sheet-forming chaperonins and the assembling lattice gas 
model is very small, especially given the inherent uncer- 
tainties in estimating Meff . 

In terms of effectiveness of assembly, we draw two main 
conclusions from the toy model. Firstly, the time taken to 
equilibrate depends strongly on the activation barrier for 
escape from misbound states, and is r ~ (M + l)e~^^/^-^. 
Thus, assembly is most rapid if the system possesses rel- 
atively weak bonds. Secondly, the number of unbinding 
events required to arrive at the assembled product de- 
pends on the number of misbound states: this number 
reflects a system's propensity for trapping, and minimis- 
ing M provides a method for increasing assembly quality. 
Practical design rules for minimisation of M remain an 
outstanding problem, but tuning the specificity of inter- 
particle attractions^ might provide a route to minimiz- 
ing this parameter. 



V. OUTLOOK 

Based on the analysis of this article, we draw two main 
conclusions. In Section IIIII we showed that the assem- 
bly mechanism assumed by classical nucleation theory 
(CNT), consisting of the growth of an isolated, com- 
pact cluster, typically operates when bonds are relatively 
weak. As bonds get stronger t his sim ple picture no longer 
holds: multiple clusters giovf^^^^, and for very strong 
bonds cluster structures become ramified. We find that 
the competition between quality and quantity of assem- 
h\y^^ results in optimal assembly happening away from 
the 'CNT regime'. The extent to which this happens 
depends on the design of inter-component interactions 
(compare the sheet-forming systems with angular speci- 
ficity (T = 0.7 (Fig. |4| with the data for a = 0.3 shown in 
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the other Figures). 

In Section [IV| we demonstrated the importance to self- 
assembly of the reversibility of binding. In models in 
which kinetic trapping is important, particles bind and 
unbind hundreds or thousands of times before finally 
adopting their final positions in the assembled super- 
structure. We associate the ratio of traffic and fiux under 
conditions of optimal assembly with a parameter Meff 
that counts degeneracy of misbound states. Large values 
of Meff indicate that a system is prone to kinetic trap- 
ping; a system's bonds must be relatively weak in order 
to avoid such trapping. 

These conclusions reinforce the importance of anneal- 
ing if kinetic trapping is to be avoided. If departures from 
CNT at optimal assembly are large, then the system is 
effective in annealing disordered clusters into well-formed 
products. Similarly, if Meff is small, the system requires 
relatively few unbinding events in order to arrive at an 
assembled product. Both these measurements reffect the 
'forgivingness' of assembly, by which we mean the ability 
of a particles to escape from kinetic traps and form an 
assembled product. We believe that guidelines for im- 
proving forgivingness are potentially useful in the design 
of self-assembly in general. For the chaperonin sheet- 
formers that we considered, we found that the version 
with reduced angular specificity seems to be the more 
forgiving of the two. Similarly, crystallisation tends to 
be most forgiving when interactions are relatively long- 
ranged; short-ranged interactions more frequently lead 
to gelation or other forms of kinetic trapping^^. Further 
simulation studies are needed in order to clarify the im- 
portance of microscopic parameters to the 'forgivingness' 
of self-assembly, and to assess how typical numbers for 
'fiux' and 'traffic' compare to those seen in the model 
systems studied here. Ultimately, however, application 
of the ideas developed here requires the development of 
experimental systems in which the microscopic reversibil- 
ity of self-assembling components can be quantified. 
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Appendix A: Minimal model of kinetic trapping 

In th is ap pendix, we analyse the toy model introduced 
in Sec. II D The master equation ([2| can be solved ex- 



actly by matrix diagonalization: we write W = SDS~^ 
where D is a diagonal matrix. The columns of S are the 
right eigenvectors of W. The solution is then P{t) = 
Se^^S~^P{0). There is a zero eigenvalue of W that 
corresponds to the steady state: we denote the other 
two eigenvalues by — A+ and — A_ which are ordered as 
< A_ < A+. 



1. Assembly yield 

The yield of assembly is nopt(^) = P2{t)- The right 
eigenvector of W corresponding to the equilibrium state 
is s = ^(a^,cMa,c) where Z = c-\-cMa-\-a'^ is the par- 
tition function. Thus the equilibrium (long-time) yield is 
^ea = — — 2 while for general t the solution is of the 
form 



^opt 



(t) 



-X-t] 



(Al) 



where a and b are (positive) constants that depend on a, 
c, and M, subject to a + 6 = 1. 

To gain physical intuition, it is convenient to assume 
that a is small. In this case, we have A+ = c{M + 1) + 
0{a) while A_ = ^^^-^^^ +0(a^). Physically, the system 
forms bonds quickly (with rate A+), arriving in a state in 



which P2 



and Pi 



M 



There is then a slow re- 



M+l "'^^^ ^ i — 

laxation (with rate A_ <C 1) in which P2 increases to the 
value neq ~ 1. [Here and in the following, we use approx- 
imate equalities to indicate that there are corrections at 
0(a).] The slow relaxation to equilibrium involves parti- 
cles escaping from the misbound energy level, and there- 
fore has an activated rate A_ ~ e~^^^'^^ . This gives rise 
to the non-monotonic yield plot shown in Fig. [2|b). 

When there is no possibility of misbinding (i.e. when 
M = 0), the previous analysis holds but b = in (Al); 



the slow stage of relaxation is irrelevant for the yield. 
In this case, yield curves are monotonic with Cb/T; see 
Fig-ic). 



2. Flux and traffic 

To obtain time-averaged flux and traffic in this model, 
we notice that the average number of transitions from 
state 1 to state between times and t is Kiq = 
a J^dt^ Pi{t')i with similar results for transitions between 
other states. For a full analysis of the statistics of the 
number of transitions between states in Markov pro- 
cesses, see— If we assume that transitions between 
states and 1 involve the making (or breaking) of one 
bond while transitions between states and 2 involve 
making or breaking of two bonds, we arrive at expres- 
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sions for the traffic and flux: 

T{t) = I At' [c(M + 2)Po(i') + aPi(t') + 2a^P2{t% 
Jo 

J^(t) = / dt' [-c(M + 2)Po{t') + aPiit') + 2a^P2{t')]. 
Jo 

(A2) 

For the initial conditions used here, it may be readily 
shown from ^ that T{t) = Pi{t) + 2P2(t), as required. 

Using the solution P{t) given above and performing 
the time integral, one arrives at 

T{t) = + k_fS-^D,^t{t)SP{0), 

where k- = {0,a,2a^), k+ = (c(M + 2),0,0) and AntW 
is a diagonal matrix with elements (t, (1— e~'^-^)/A_, (1 — 



)/A+). 



We again analyse the limit of small a. For large times 
(A_t 1) the flux saturates at 2 + 0(a) while the traffic 
is 



T{t) ^ 2(M + 1) + 2(M + 2)a^t. 



(A3) 



For large enough times, the second term dominates and 
the traffic increases linearly with time, but if <C 
t <C then traffic saturates at 2(M + 1). This is the 
limit in which the number of unbinding events from the 
misbound state is large, but unbinding events from the 
optimally-bound state are rare enough that they may be 
neglected. The existence of such a limit is the basis for 



the extraction of the parameter Meff discussed in Sec. IV 



Appendix B: Inter-chaperonin potential 

Model chaperonins are hard spheres of diameter 2a, 
equipped with an attractive pairwise interaction that op- 
erates only when the centres of two chaperonins lie within 
a distance 2a and 2a + a/4. Consider two chaperonins i 
and j that lie within this interaction range. Let rii and 
rij be unit vectors pointing from the centre of each chap- 
eronin to its north pole, and let rij be the unit vector 
pointing from the centre of i to the centre of j. Let 
be the angle between the orientation vectors rii and rij. 




2x10'' 4x10'' 6x10'' 




FIG. 8. System- wide fluctuations associated with nucleation 
can be controlled by using bond number (system energy) as 
a measure of assembly progress. (Top) We show yield riopt 
against time t for three independent dynamical simulations 
of the sheet-forming model considered in the main text, for 
Ch/T — 6.6. Here assembly proceeds via nucleation and 
growth of a single sheet, and the characteristic time for ap- 
pearance of the sheet is broadly distributed. (Bottom) How- 
ever, when bond number rib is used as a measure of reaction 
progress, the data collapse. This collapse reveals that the as- 
sembly mechanism in all three trajectories is the same, and 
motivates the parametric plot of Fig. |3] 



sheet-forming system described in the main text we set 
c"aiign = cTeq = 0-3. For the shcct-formiug system de- 



and let Oi be the angle between rii and Vij (and let Oj scribed in Appendix B we set (Jaiign = cTeq = 0-7. 
be the angle between rij and —Vij). Our 'sticky equator' 
systems have orientational interaction 



For the 'sticky pole' system we choose the angular in- 
teraction 



ij 5 <^align )Co(^i;CTeq)Co(%;CTeq), (Bl) 



where Ca (V^; cr) = e~^^°^^~"^ rewards the alignment 
of angles tjj and cos~^ a. The parameter a determines 
the angular tolerance of this interaction. Ca (V^; cr) = 
Ca (V^; cr) + C-a (V^; < t) i s this function's symmetrized 
counterpart. In Eq. ( |B1| ) the factors Co encourage ori- 
entation vectors to point perpendicular to the inter- 
chaperonin vector. The factor Ci encourages orienta- 
tion vectors to point parallel or antiparallel. For the 



Cpoi = — ebC'i(^ij; craiign)C'i(6>i; crpoi)Ci(6>j; cTpoi), (B2) 



whose three functions encourage alignment vectors to 
point parallel or antiparallel (function 1), and align- 
ment vectors to point parallel or antiparallel to the inter- 
chaperonin vector (functions 2 and 3). We set (Jaiign = 0.3 
and cTpoi = 0.12. 
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Appendix C: Trajectory-to-trajectory fluctuations 

The self-assembly of sheets and lattice gas clusters re- 
flects an underlying first-order phase transition, and can 
happen, roughly speaking, in one of two ways. Either a 
single critical nucleus appears in the system and grows by 
acquiring monomers, or many clusters of the new phase 
grow simultaneously and coalesce. Which of these mecha- 
nisms operates depends on the thermodynamic state and 
the system size (the latter is fixed in our simulations). 
The nucleation regime is characterised by large fluctua- 
tions: the randomly-distributed time at which the first 
critical nucleus appears strongly affects the behaviour of 
the whole system. In simulation studies, fluctuations as- 
sociated with rare nucleation events lead to substantial 
differences in values of observables such as assembly yield 
^opt(^) from run-to-run; the time-averaged yield (nopt(^)) 
is usually not representative of the behaviour of any sin- 
gle trajectory. In order to deduce the assembly mecha- 
nism it is therefore useful to use the number of bonds in 
the system, rather than time, as a reaction coordinate. 
Fig. |8] shows that data from different trajectories col- 
lapse in this representation: although the time to assem- 
bly varies significantly between trajectories, the assembly 
mechanism does not. 
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